[mlir][ByteCodeReader] Fix bugs in EncodingReader::alignTo #140660

PeimingLiu · 2025-05-20T02:36:43Z

It seems that the number of padding value should be computed based on the relative distance between ptr and buffer.begin() (instead of the absolute address of ptr). Otherwise, the skipped padding value will be indeterministic and depends on the initial alignment of buffer.begin()?

Or is there any implicit assumption on the alignment of buffer that holds the bytecode?

See

llvm-project/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp

Line 198 in 755acb1

size_t paddingSize = llvm::alignTo(curOffset, alignment) - curOffset;

for how ByteCoderWriter emits padding values.

It seems that the number of padding value should be computed based on the offset of the current pointer to the bytecode buffer begin (instead of the absolute address of `ptr`). Otherwise, the skipped padding value will be dependent on the alignment of `buffer.begin()`?

llvmbot · 2025-05-20T02:37:17Z

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Peiming Liu (PeimingLiu)

Changes

It seems that the number of padding value should be computed based on the relative distance between ptr and buffer.begin() (instead of the absolute address of ptr). Otherwise, the skipped padding value will be indeterministic and depends on the alignment of buffer.begin()?

Or is there any implicit assumption on the alignment of buffer that holds the bytecode?

Full diff: https://github.com/llvm/llvm-project/pull/140660.diff

1 Files Affected:

(modified) mlir/lib/Bytecode/Reader/BytecodeReader.cpp (+3-2)

diff --git a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
index 1052946d4550b..793170dcc594c 100644
--- a/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+++ b/mlir/lib/Bytecode/Reader/BytecodeReader.cpp
@@ -107,7 +107,8 @@ class EncodingReader {
       return emitError("expected alignment to be a power-of-two");
 
     auto isUnaligned = [&](const uint8_t *ptr) {
-      return ((uintptr_t)ptr & (alignment - 1)) != 0;
+      unsigned offset = ptr - buffer.begin();
+      return (offset & (alignment - 1)) != 0;
     };
 
     // Shift the reader position to the next alignment boundary.
@@ -1506,7 +1507,7 @@ class mlir::BytecodeReader::Impl {
     UseListOrderStorage(bool isIndexPairEncoding,
                         SmallVector<unsigned, 4> &&indices)
         : indices(std::move(indices)),
-          isIndexPairEncoding(isIndexPairEncoding){};
+          isIndexPairEncoding(isIndexPairEncoding) {};
     /// The vector containing the information required to reorder the
     /// use-list of a value.
     SmallVector<unsigned, 4> indices;

River707 · 2025-05-20T03:27:35Z

The alignment in the reader must be for the actual pointer address, because it gets used to directly load/read data that has alignment requirements. There is an expectation that whoever created the buffer (by alloc or mmap) ensures that the alignment is at least to the amount expected, though in a majority of cases (unless you have buffers with large alignment constraints) it already is given malloc gives you at least max_align_t

River707

Marking RC because this is not the expected reader behavior.

PeimingLiu · 2025-05-20T03:52:23Z

Make sense, that was why I had the question "Or is there any implicit assumption on the alignment of buffer that holds the bytecode?"

at least to the amount expected,

What is the expected amount (alignof(max_align_t))? Also, does it make sense to return an Error when the buffer is not aligned as expected? This is quite hard to dig when error occurs.

I might have missed important pieces, but it still seems somehow unsafe. E.g.,

// In ByteCodeWriter.cpp
Section_1 = {/*size = 16 bytes*/} 
Section_2 = {/*alignment=32 bytes =>*/padding = char[16]}

// In ByteCodeReader.cpp

// content alignof(16) say, ptr = 0x10
Section_1 = {/*size = 16 bytes*/} 
// ptr = 0x20 (already aligned to 32)
Section_2 = {/*pad value loaded as section data*/}.

It is only guaranteed to be safe when the buffer is aligned to the maximum section alignments IIUC.

llvmbot added mlir:core MLIR Core Infrastructure mlir labels May 20, 2025

PeimingLiu requested a review from chsigg May 20, 2025 02:54

River707 requested changes May 20, 2025

View reviewed changes

PeimingLiu closed this May 20, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][ByteCodeReader] Fix bugs in EncodingReader::alignTo #140660

[mlir][ByteCodeReader] Fix bugs in EncodingReader::alignTo #140660

Uh oh!

PeimingLiu commented May 20, 2025 •

edited

Loading

Uh oh!

llvmbot commented May 20, 2025 •

edited

Loading

Uh oh!

River707 commented May 20, 2025 •

edited

Loading

Uh oh!

River707 left a comment

Uh oh!

PeimingLiu commented May 20, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[mlir][ByteCodeReader] Fix bugs in EncodingReader::alignTo #140660

[mlir][ByteCodeReader] Fix bugs in EncodingReader::alignTo #140660

Uh oh!

Conversation

PeimingLiu commented May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

River707 commented May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

River707 left a comment

Choose a reason for hiding this comment

Uh oh!

PeimingLiu commented May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

PeimingLiu commented May 20, 2025 •

edited

Loading

llvmbot commented May 20, 2025 •

edited

Loading

River707 commented May 20, 2025 •

edited

Loading

PeimingLiu commented May 20, 2025 •

edited

Loading